The Kullback-Leibler (KL) divergence is a distance-like measure between the distributions of two random variables. It is defined as

$$D_{\mathrm{KL}}(P \parallel Q) = \sum_x P(x) \log \frac{P(x)}{Q(x)}.$$
It is not symmetric in $P$ and $Q$, so it is not a true distance metric.
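As a concrete illustration, here is a minimal sketch of the definition for discrete distributions represented as NumPy probability vectors; the helper name `kl_divergence` and the example distributions are chosen here for illustration only.

```python
import numpy as np

def kl_divergence(p, q):
    """Compute D_KL(P || Q) in nats for discrete probability vectors."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Terms with P(x) = 0 contribute nothing, by the convention 0 * log 0 = 0.
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])

print(kl_divergence(p, q))  # D_KL(P || Q)
print(kl_divergence(q, p))  # D_KL(Q || P), generally a different value
```

Running the two calls gives different numbers, which illustrates the asymmetry.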

Derivation of the KL divergence

Recall that cross-entropy is defined as

$$H(P, Q) = -\sum_x P(x) \log Q(x).$$
When $Q = P$, this simplifies to the Shannon entropy,

$$H(P, P) = -\sum_x P(x) \log P(x) = H(P).$$
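To make this relationship concrete, the sketch below (assuming NumPy and natural logarithms, with `cross_entropy` as a hypothetical helper name) evaluates the cross-entropy of two discrete distributions and checks that it reduces to the Shannon entropy when the two arguments coincide.

```python
import numpy as np

def cross_entropy(p, q):
    """H(P, Q) = -sum_x P(x) log Q(x), in nats."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return -np.sum(p * np.log(q))

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])

print(cross_entropy(p, q))  # H(P, Q)
print(cross_entropy(p, p))  # equals the Shannon entropy H(P)
```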
Hence cross-entropy is not a distance measure between the distributions $P$ and $Q$: even when $P = Q$ it does not vanish, but instead equals the entropy $H(P)$. We can obtain a distance-like measure by subtracting the Shannon entropy of the true distribution from the cross-entropy:

$$D_{\mathrm{KL}}(P \parallel Q) = H(P, Q) - H(P) = -\sum_x P(x) \log Q(x) + \sum_x P(x) \log P(x).$$
Applying the logarithm property that $\log a - \log b = \log \frac{a}{b}$, we obtain the standard definition of the Kullback-Leibler divergence,

$$D_{\mathrm{KL}}(P \parallel Q) = \sum_x P(x) \log \frac{P(x)}{Q(x)}.$$
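As a quick numerical check of the derivation, the sketch below (assuming SciPy is available; `scipy.stats.entropy(p, q)` computes the KL divergence in nats) compares $H(P, Q) - H(P)$ against the standard formula.

```python
import numpy as np
from scipy.stats import entropy  # entropy(p, q) returns D_KL(P || Q) in nats

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])

cross_entropy = -np.sum(p * np.log(q))    # H(P, Q)
shannon_entropy = -np.sum(p * np.log(p))  # H(P)

# The difference matches the direct KL formula up to floating-point error.
print(cross_entropy - shannon_entropy)
print(entropy(p, q))
```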